K-Means VQ algorithm using a low-cost parallel cluster computing
Abstract
It is well known that the time and memory needed to create a codebook from large training databases have hindered the use of vector quantization based systems in real applications. To overcome this problem, we present a parallel approach to the K-means Vector Quantization (VQ) algorithm based on the master/slave paradigm and low-cost parallel cluster computing. Distributing the training samples over the slaves' local disks reduces the overhead associated with the communication process. In addition, models that predict computation and communication time have been developed. These models are useful for predicting the optimal number of slaves given the number of training samples and the codebook size. The experiments have shown the efficiency of the proposed models as well as a linear speedup of the vector quantization process used in a two-stage Hidden Markov Model (HMM)-based system for recognizing handwritten numeral strings.
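The master/slave scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`slave_step`, `master_update`, `parallel_kmeans`), the strided sharding, the random initialization, and the fixed iteration count are all assumptions made for illustration, and the slaves are simulated sequentially in one process instead of running on cluster nodes. The point it shows is that each slave only returns per-codeword partial sums and counts for its local shard, so the data exchanged per iteration depends on the codebook size and not on the number of training samples.

```python
import random


def nearest(codebook, x):
    """Index of the codeword closest to x (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(codebook[k], x)),
    )


def slave_step(shard, codebook):
    """Work of one (hypothetical) slave: partial sums and counts over its local shard."""
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    for x in shard:
        k = nearest(codebook, x)
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += x[d]
    return sums, counts


def master_update(codebook, partials):
    """Master combines the slaves' partial results into a new codebook."""
    dim = len(codebook[0])
    new_codebook = []
    for k, old in enumerate(codebook):
        n = sum(counts[k] for _, counts in partials)
        if n == 0:
            new_codebook.append(list(old))  # empty cell: keep the old codeword
        else:
            new_codebook.append(
                [sum(sums[k][d] for sums, _ in partials) / n for d in range(dim)]
            )
    return new_codebook


def parallel_kmeans(samples, k, n_slaves, iters=20, seed=0):
    """Sequential simulation of master/slave K-means (assumed structure)."""
    rng = random.Random(seed)
    codebook = [list(x) for x in rng.sample(samples, k)]
    # Strided split stands in for samples pre-distributed to the slaves' local disks.
    shards = [samples[i::n_slaves] for i in range(n_slaves)]
    for _ in range(iters):
        # On a real cluster these calls would run concurrently, one per slave.
        partials = [slave_step(shard, codebook) for shard in shards]
        codebook = master_update(codebook, partials)
    return codebook


samples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
codebook = parallel_kmeans(samples, k=2, n_slaves=3)
```

Because the master only adds up partial sums and counts, the resulting codebook does not depend on how the samples are split among the slaves, apart from floating-point rounding.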
Similar resources
A Low-Cost Parallel K-Means VQ Algorithm Using Cluster Computing
In this paper we propose a parallel approach for the K-means Vector Quantization (VQ) algorithm used in a two-stage Hidden Markov Model (HMM)-based system for recognizing handwritten numeral strings. With this parallel algorithm, based on the master/slave paradigm, we overcome two drawbacks of the sequential version: a) the time taken to create the codebook; and b) the amount of memory necessary ...
Parallel computing using MPI and OpenMP on self-configured platform, UMZHPC.
Parallel computing is a topic of interest for a broad scientific community since it facilitates many time-consuming algorithms in different application domains. In this paper, we introduce a novel platform for parallel computing by using MPI and OpenMP programming languages based on a set of networked PCs. UMZHPC is a free Linux-based parallel computing infrastructure that has been developed to cr...
Parallel Spatial Pyramid Match Kernel Algorithm for Object Recognition using a Cluster of Computers
This paper parallelizes the spatial pyramid match kernel (SPK) implementation. SPK is one of the most widely used kernel methods and, combined with a support vector machine classifier, achieves high accuracy in object recognition. The MATLAB Parallel Computing Toolbox has been used to parallelize SPK. In this implementation, MATLAB Message Passing Interface (MPI) functions and features included in the toolbox help u...
Distributed clustering algorithms over a cloud computing platform. (Algorithmes de classification répartis sur le cloud)
The subjects addressed in this thesis are inspired by research problems faced by the Lokad company. These problems relate to the challenge of designing efficient parallelization techniques for clustering algorithms on a Cloud Computing platform. Chapter 2 provides an introduction to the Cloud Computing technologies, especially the ones devoted to intensive computations. Chapter 3 details ...